Implement CPU usage reduction feature #12614

Merged
merged 1 commit into from Sep 13, 2022

elad335
Contributor

@elad335 elad335 commented Sep 6, 2022

Emulation poses a problem on the host system because the emulated programs are not optimized for the host, which often creates issues. A common pattern is that coded CPU sleep is tuned to the console's performance and is often reduced to a minimum, because sleeping benefits much less when the game is the only application the console is optimized to run (it doesn't need to care about the performance of other applications while it's running). This is especially an issue on older consoles or with CELL SPU emulation, where CPU sleep is barely supported by the hardware. As a result, even when emulating on strong hardware that can be 100 times faster than the original console, CPU usage is often high even when the FPS never drops.
High CPU usage often means fast battery drain, reduced multitasking performance, high hardware temperatures, etc.

To solve this problem, implemented a setting that adds planned short CPU sleep calls to all emulation threads when the target FPS is hit. (Needs FPS to be locked either by the game or by RPCS3.)
It does this by using per-frame performance analysis and slowly readjusting the CPU sleep call count to suit your hardware. This means an improved experience when using RPCS3 on mobile devices such as the Steam Deck or laptops. Do note that high values may cause audio issues or sudden FPS drops; in that case, lower the value. The setting is called "Max CPU Preempt Count" and is found on the CPU tab (disabled/0 by default).
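To make the idea concrete, here is a minimal Python sketch of a per-frame feedback loop of this kind. It is not the code from this PR (which is C++ and not shown in this conversation): the class name `PreemptController`, the one-step increase, the halving on a late frame, and the `tolerance_ms` parameter are all illustrative assumptions. Only the general behaviour (raise the sleep count slowly while the target FPS is hit, never exceed the user's maximum, 0 means disabled) comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class PreemptController:
    """Hypothetical per-frame controller; names and step sizes are assumptions."""

    max_preempt_count: int        # user setting; 0 disables the feature
    target_frame_time_ms: float   # e.g. 33.3 for a 30 FPS cap
    tolerance_ms: float = 0.5     # assumed slack so jitter at the cap is not a "miss"
    current: int = 0              # sleep calls per frame currently in use

    def on_frame(self, frame_time_ms: float) -> int:
        """Update the per-frame sleep count from the last frame's duration."""
        if self.max_preempt_count == 0:
            self.current = 0
        elif frame_time_ms <= self.target_frame_time_ms + self.tolerance_ms:
            # Target FPS was hit: creep upwards by one sleep per frame.
            self.current = min(self.current + 1, self.max_preempt_count)
        else:
            # Frame was late: back off quickly (assumed policy: halve).
            self.current = min(self.current // 2, self.max_preempt_count)
        return self.current
```

Because `max_preempt_count` is read on every frame, changing it while the loop runs takes effect on the next frame, which is one simple way a setting like this can be adjusted mid-game.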

rpcs3/Emu/RSX/RSXThread.cpp (outdated, resolved review thread)
@elad335
Contributor Author

elad335 commented Sep 11, 2022

Improved it so even higher values can be used without frame drops.

@Darkhost1999
Contributor

Darkhost1999 commented Sep 11, 2022

PR: [image]
Master: [image]
I haven't gotten as far as testing the setting yet because, whoa, that's jarring. I'll test now on some lightweight games where I can reach the target FPS.
(edit reason: my theme might be confrontational, so I'll use a popular one)

@elad335
Contributor Author

elad335 commented Sep 11, 2022

You need the FPS limiter enabled if the FPS isn't capped by the game. Also, it takes a minute or a few to see results (it slowly reduces CPU usage).
This setting is also dynamic, so you can change it mid-game.

@Darkhost1999
Contributor

Understood, I'm just showing that the Qt layout looks broken. Nice to know this is dynamic. Thank you, that'll help too.

@Darkhost1999
Contributor

Darkhost1999 commented Sep 11, 2022

RPCS3.log
I'm confused: your comment says "disabled" by default, so do I need to bump the slider to 1? How do I enable it?
I've set the frame limiter to 30 after trying PS3 Native. oblivion.ini has ifps_clamp set to 30 by default, and I checked to verify that I haven't changed it, so my game is limiting the FPS. (I get close to a perfect 60 FPS when using the unlock FPS patch and the auto frame limiter.) So I'm just supposed to play this game and let it magically adjust the slider over time? Is 1 hour long enough?

@elad335
Contributor Author

elad335 commented Sep 11, 2022

Well, the more CPU preemptions, the more effect this PR has, so 0 means no effect at all.

@elad335
Contributor Author

elad335 commented Sep 11, 2022

Another factor is the CPU usage, so that needs to be observed too. And 15 minutes is fine as well.

@Darkhost1999
Contributor

OBLIVION_BLUS30087.png
It's very lightweight as is. I might need to try a more intense game to get better results.

@Darkhost1999
Contributor

My thinking is that the lighter the game is on the hardware, the better the results will be, because if you can't reach the FPS limit you won't get anything.

@elad335
Contributor Author

elad335 commented Sep 11, 2022

It's because it doesn't use SPUs. (0 usage)

@kd-11
Contributor

kd-11 commented Sep 12, 2022

In my opinion this implementation is a bit too complicated for the end user who will not understand what the option even does. We can move it to advanced tab, but that doesn't achieve the goal here. We really should have just had something easy to tune, a simple "low power mode" checkbox that limits fps to 30 or 60 and calibrates pre-emptions automatically. Even the description of the CPU pre-emptions doesn't help the average user here, they will simply not use it.
Also, we shouldn't have to pre-empt the same way as the other threads at all. If it's not rendering and low-power is selected, just pause it automatically. There is no need to calibrate it the way it is done; it seems a bit overkill. Usually RSX will spin around with an idle queue if there is nothing to render anyway.
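As a rough illustration of the "just pause it when idle" idea, the Python sketch below shows a worker that blocks on its command queue instead of spinning while the queue is empty. This is a generic pattern, not RPCS3's RSX thread: `render_worker`, the command strings, and the 10 ms timeout are all made up for the example.

```python
import queue
import threading


def render_worker(commands: queue.Queue, rendered: list, stop: threading.Event) -> None:
    """Hypothetical render loop that sleeps in the OS while its queue is idle."""
    while not stop.is_set():
        try:
            # Blocking get: the thread is descheduled until work arrives
            # or the (assumed) 10 ms timeout expires, instead of busy-waiting.
            cmd = commands.get(timeout=0.01)
        except queue.Empty:
            continue  # nothing to render; re-check the stop flag
        rendered.append(cmd)  # stand-in for actually drawing something
        commands.task_done()
```

The trade-off raised in the replies still applies to a sketch like this: an idle render queue says nothing about how busy the other emulation threads are.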

@elad335
Contributor Author

elad335 commented Sep 12, 2022

There must be some user interaction with it. It can't be trimmed down to something like "enabled/disabled" because it can't predict gameplay sections that are suddenly far more demanding than others; in that case an FPS drop can occur until it readjusts itself. But this is a problem the user can solve by looking at the log, seeing what values it actually ends up using after the performance drop, and setting that as the global max value. The same goes for audio issues at high values: something we can't detect but the user can. Also, "inactive rendering" detection doesn't really work for audio or loading screens either (during loading not much rendering occurs but the PPU/SPU are fully active, so loading is lengthened considerably when using rendering-based adjustments) and it breaks down horribly. The sleeps need to be spaced out equally along the frame time to prevent issues.
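The equal spacing along the frame time can be pictured with a small Python sketch. The function name `preempt_offsets_ms` and the choice to leave a gap before the first and after the last sleep are assumptions for illustration, not details taken from the PR.

```python
def preempt_offsets_ms(frame_time_ms: float, count: int) -> list[float]:
    """Return offsets (ms from frame start) for `count` equally spaced sleeps."""
    if count <= 0:
        return []
    # Divide the frame into count + 1 equal slices and sleep at each boundary,
    # so no part of the frame (rendering, audio, loading) takes all the pauses.
    step = frame_time_ms / (count + 1)
    return [step * (i + 1) for i in range(count)]
```

For a 40 ms frame and 3 sleeps this yields offsets at 10, 20 and 30 ms, in contrast to a rendering-based scheme that would bunch all pauses into whatever part of the frame happens to have no draw work.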

@kd-11
Contributor

kd-11 commented Sep 12, 2022

This diminishes the usability drastically. Users are trying to "pick up and play" and will usually not do things like looking at logs to optimize performance. That's something developers do, not gamers.
The reason I point out rendering is that there is a fixed amount of content to draw each frame, so we should be able to tell when there is nothing to do and save on CPU power if we're not actively working to put something onto the display.
Basically, new code is always a negative and we need strong positives to balance it out. There is potential in the idea here, I'm just convinced this can be made more accessible to the end user. That said, I'll play around with it and see if this can be improved. It would be a great addition as rpcs3 is power constrained on handhelds.

@elad335
Contributor Author

elad335 commented Sep 12, 2022

Made this setting much less obscure; it now recommends values below 50, so most users won't need to think much about it. Also, the maximum value has been reduced to 300.
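A minimal Python sketch of how limits like these might be enforced on a user-supplied value. The function and constant names are hypothetical; only the two numbers (recommended below 50, maximum 300) come from the comment above.

```python
MAX_PREEMPT_COUNT = 300   # maximum stated in the comment above
RECOMMENDED_BELOW = 50    # recommended range stated in the comment above


def validate_preempt_count(value: int) -> tuple[int, bool]:
    """Clamp the setting to [0, 300] and report whether it is in the recommended range."""
    clamped = max(0, min(value, MAX_PREEMPT_COUNT))
    return clamped, clamped < RECOMMENDED_BELOW
```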

@elad335
Contributor Author

elad335 commented Sep 12, 2022

Improved the analysis to make FPS drops much shorter even at high values, and made logging more useful for users who do want to tinker with it.

@elad335 elad335 force-pushed the cpu-time branch 2 times, most recently from 90243b0 to 9f0f572 Compare September 12, 2022 15:41
@elad335 elad335 force-pushed the cpu-time branch 5 times, most recently from 83716a6 to ac2720d Compare September 13, 2022 11:40
@elad335
Contributor Author

elad335 commented Sep 13, 2022

Fixed the Qt bug.
[image]

@elad335 elad335 force-pushed the cpu-time branch 2 times, most recently from e4994b4 to 3a1a93d Compare September 13, 2022 11:48
@Danke-Boi

This could probably even improve performance on some devices that are thermally limited.

Contributor

@kd-11 kd-11 left a comment

lgtm

@kd-11 kd-11 merged commit ec7b18d into RPCS3:master Sep 13, 2022
@Nicholas-Steel

Nicholas-Steel commented Sep 13, 2022

This seems to cause visual corruption in Ni No Kuni when set to 25 and you're in the Dark Forest area. Here's a save file for testing. It should be immediately apparent: it looks like portions of assets blink out of existence for a frame.

BLUS30947_00002.zip

@elad335
Contributor Author

elad335 commented Sep 13, 2022

@Nicholas-Steel Hi, try setting "Allow RSX CPU Preemptions: false" in the game config file when using this feature.

@Nicholas-Steel

@elad335 That had no effect. This is what it looks like: https://www.youtube.com/watch?v=0j38bggAkp0

config_BLUS30947.zip (the change elad335 mentioned is not present in this copy of the config but I did test it)

6 participants